================================================================================
README.txt - Matrix Product Approximation Experiments (Figure 1)
================================================================================

### PROJECT TITLE
Matrix Product Approximation Experiments: Evaluating Algorithms and Bounds with Target Rho_G Control (Figure 1)

### OBJECTIVE
This project implements and evaluates several algorithms for approximating the matrix product A @ B.T. It compares the actual approximation error of each algorithm against several theoretical bounds, including user-defined QP-based bounds and standard bounds from the literature. The main goal is to analyze how these methods behave as the "sparsity" parameter k (the number of selected columns/features) varies and as the input matrices change. Matrix characteristics are controlled by generating matrices with specific target Rho_G values. This version is configured for the "Figure 1" experiment, which is set up in the `run_experiment1.py` script.

### FILE STRUCTURE
The project consists of two main Python files:

1.  `matrix_product_approximations_exp1.py` (Core Library):
    *   This file contains all the core logic and functionalities.
    *   Includes:
        *   Matrix generation functions, with a special emphasis on generating matrices (A, B) to achieve target Rho_G values.
        *   Implementations of matrix product approximation algorithms:
            *   Leverage Score Sampling
            *   CountSketch
            *   SRHT (Subsampled Randomized Hadamard Transform)
            *   Gaussian Projection
            *   Greedy OMP (Orthogonal Matching Pursuit)
        *   Functions to compute various theoretical error bounds:
            *   User-defined QP-based bounds (Analytical and CVXPY-optimized versions).
            *   User-defined Binary bound.
            *   Standard bounds from literature (e.g., Leverage Score Expectation, Simple Sketching).
        *   A function to compute the optimal v_k^* error (combinatorial, typically for small 'n' due to complexity).
        *   Utilities for running experiments and managing data.
        *   Plotting functions to visualize results, including relative squared errors and bounds against k/n, often generating separate legend files for clarity.

2.  `run_experiment1.py` (Experiment Script):
    *   This script runs the "Figure 1" experiment.
    *   It imports necessary functions from `matrix_product_approximations_exp1.py`.
    *   Allows configuration of experiment parameters directly within the script, such as:
        *   Matrix dimensions (m_dim, p_dim, n_dim).
        *   The list of k values (sparsity levels) to be tested.
        *   Target Rho_G values for matrix generation.
        *   Number of trials for randomized algorithms.
        *   Flags to enable or disable specific computations (e.g., v_k^*, certain algorithms, or bounds).
    *   Orchestrates the experiment flow: matrix generation, algorithm execution, bound computation, and result plotting/saving.
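
To make the idea behind the sketching algorithms listed above concrete, here is a minimal CountSketch example. It is an illustration under stated assumptions, not the code in `matrix_product_approximations_exp1.py`: the function names `countsketch_matrix` and `sketched_product` are hypothetical, and the product is approximated as `(A S)(B S).T` with an n x k sketching matrix `S`.

```python
import numpy as np

def countsketch_matrix(n, k, rng):
    """Return an n x k CountSketch matrix: exactly one random +/-1 entry per row."""
    buckets = rng.integers(0, k, size=n)      # hash each of the n coordinates to a bucket
    signs = rng.choice([-1.0, 1.0], size=n)   # independent random sign per coordinate
    S = np.zeros((n, k))
    S[np.arange(n), buckets] = signs
    return S

def sketched_product(A, B, S):
    """Approximate A @ B.T by (A @ S) @ (B @ S).T."""
    return (A @ S) @ (B @ S).T

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 64))   # m x n
B = rng.standard_normal((4, 64))   # p x n
S = countsketch_matrix(n=64, k=16, rng=rng)
approx = sketched_product(A, B, S)  # m x p, same shape as A @ B.T
```

Because each row of `S` holds a single random sign, `S @ S.T` equals the identity in expectation, which is what makes the sketched product an unbiased estimate of `A @ B.T`.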

### SETUP / REQUIREMENTS
1.  **Python**: Python 3.x is required.
2.  **Libraries**: The project depends on the following Python libraries:
    *   `numpy`: For numerical operations.
    *   `matplotlib`: For generating plots.
    *   `scipy`: For scientific computing functionalities.
    *   `cvxpy`: For solving convex optimization problems (used for one of the QP bounds).
    *   `pandas`: For data manipulation (e.g., if saving results to CSV).
    *   `tqdm`: For displaying progress bars during execution.
3.  **Installation**:
    *   Ensure Python 3 and pip are installed on your system.
    *   It is highly recommended to use a Python virtual environment to manage dependencies.
    *   Install the required libraries using pip:
        `pip install numpy matplotlib scipy cvxpy pandas tqdm`
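
After installing, a quick check like the following can confirm that every library is importable. This is an optional helper sketch, not part of the project; the function name `missing_packages` is hypothetical.

```python
import importlib.util

# Package names taken from the requirements list above.
REQUIRED = ["numpy", "matplotlib", "scipy", "cvxpy", "pandas", "tqdm"]

def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

`find_spec` only locates each package without importing it, so the check stays fast even for heavy libraries.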

### EXECUTION INSTRUCTIONS
1.  Save `matrix_product_approximations_exp1.py` and `run_experiment1.py` in the same directory.
2.  Open the `run_experiment1.py` file using a text editor.
3.  Modify the parameters within the `if __name__ == "__main__":` block at the end of the `run_experiment1.py` file to configure Figure 1 according to your needs (see "CONFIGURATION" section below).
4.  Open a terminal or command prompt.
5.  Navigate (using `cd`) to the directory where you saved the Python files.
6.  Execute the experiment script using the Python interpreter:
    `python run_experiment1.py`

The script will then proceed to:
*   Generate matrices based on the specified target Rho_G values.
*   Run the suite of approximation algorithms and compute theoretical bounds for each configured k value.
*   Print progress information and any warnings to the console.
*   Save the resulting plots and, if enabled, the numerical data.
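
The algorithm-execution step can be pictured as a loop over k values and trials. The sketch below uses a plain Gaussian projection as a stand-in for the full algorithm suite and averages the relative squared error over trials. All function names here are hypothetical; the actual flow in `run_experiment1.py` may differ.

```python
import numpy as np

def relative_squared_error(exact, approx):
    """(||exact - approx||_F / ||exact||_F)^2."""
    return (np.linalg.norm(exact - approx, "fro") / np.linalg.norm(exact, "fro")) ** 2

def gaussian_projection_product(A, B, k, rng):
    """Approximate A @ B.T using an n x k Gaussian sketch with N(0, 1/k) entries."""
    n = A.shape[1]
    S = rng.standard_normal((n, k)) / np.sqrt(k)
    return (A @ S) @ (B @ S).T

def mean_error_per_k(A, B, k_values, num_trials, seed=0):
    """Average the relative squared error over num_trials for each k."""
    rng = np.random.default_rng(seed)
    exact = A @ B.T
    results = {}
    for k in k_values:
        errors = [
            relative_squared_error(exact, gaussian_projection_product(A, B, k, rng))
            for _ in range(num_trials)
        ]
        results[k] = float(np.mean(errors))
    return results

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 40))   # m x n
B = rng.standard_normal((15, 40))   # p x n
errors = mean_error_per_k(A, B, k_values=[5, 10, 20], num_trials=5)
for k, err in errors.items():
    print(f"k={k:3d}  mean relative squared error={err:.3f}")
```

Averaging over several trials matters because a single random sketch can land far from its expected error, especially for small k.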

### CONFIGURATION (within `run_experiment1.py`)
Key parameters for Figure 1 can be adjusted directly in the `if __name__ == "__main__":` block of `run_experiment1.py`:

*   **Matrix Dimensions**:
    *   `m_dim`: Number of rows in matrix A.
    *   `p_dim`: Number of rows in matrix B.
    *   `n_dim`: The common dimension (number of columns in A and B).
*   **Sparsity Levels (k)**:
    *   `k_values_exp1`: A list or array of k values to test.
*   **Matrix Characteristics**:
    *   `target_rho_g_values_exp1`: A list of target Rho_G values for which matrices will be generated and experiments run.
*   **Experiment Control**:
    *   `num_trials_exp1`: Number of trials for randomized algorithms.
    *   Flags to enable/disable parts of the experiment, such as:
        *   `run_optimal_vk_star_exp1`
        *   `run_algorithms_exp1`
        *   `run_bounds_exp1`
*   **Output Control**:
    *   `save_plots_exp1`: Boolean to enable/disable saving plots.
    *   `save_results_to_csv_exp1`: Boolean to enable/disable saving numerical results to CSV files.
    *   `plot_file_prefix_exp1`, `results_dir_exp1`, `plots_dir_exp1`: Define naming and locations for output files.
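
As a rough guide, a configuration using the parameter names above might look like the following. Only the names come from this README; every value is a hypothetical placeholder, and in `run_experiment1.py` these assignments sit inside the `if __name__ == "__main__":` block.

```python
# Hypothetical values for illustration; only the parameter names come from this README.

# Matrix dimensions: A is m_dim x n_dim, B is p_dim x n_dim.
m_dim = 50
p_dim = 50
n_dim = 100

# Sparsity levels k to test (assumed to satisfy k <= n_dim).
k_values_exp1 = list(range(10, n_dim + 1, 10))

# One set of matrices is generated per target Rho_G value.
target_rho_g_values_exp1 = [0.2, 0.5, 0.9]

# Number of repetitions for the randomized algorithms.
num_trials_exp1 = 20

# Switches for the expensive or optional parts of the run.
run_optimal_vk_star_exp1 = False   # combinatorial; practical only for small n_dim
run_algorithms_exp1 = True
run_bounds_exp1 = True

# Output control.
save_plots_exp1 = True
save_results_to_csv_exp1 = False
plot_file_prefix_exp1 = "exp1_rho_comparison"
results_dir_exp1 = "results"
plots_dir_exp1 = "plots"
```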

### OUTPUT
The script generates the following outputs:

1.  **Console Output**:
    *   Progress bars (via `tqdm`) for matrix generation and experiment iterations.
    *   Information about the generated matrices, including the actual Rho_G achieved versus the target.
    *   Warnings or error messages if issues are encountered during matrix generation, algorithm execution, or bound computation.

2.  **Plots**:
    *   Plots are saved in a `plots/` subdirectory (this directory is created automatically if it doesn't exist).
    *   A key output is typically a combined plot (e.g., `exp1_rho_comparison_plots.png`) that displays the relative squared error versus `k/n_dim` for each target Rho_G scenario. Each subplot in this figure usually corresponds to one Rho_G value, allowing for comparison across different matrix structures.
    *   A separate legend file (e.g., `exp1_rho_comparison_legend.png`) is often generated to accompany the main plot, ensuring the plot itself remains uncluttered.
    *   These plots visualize the performance of the actual algorithms against the various theoretical bounds.

3.  **Data (Optional)**:
    *   If `save_results_to_csv_exp1` is set to `True` in `run_experiment1.py`, the raw numerical results (errors and bound values for each k and Rho_G) are saved as CSV files.
    *   These CSV files are typically stored in a `results/` subdirectory (created automatically).
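
To check the achieved Rho_G reported on the console independently, the definition given in the Notes section can be evaluated directly. The sketch below assumes that `*` in that definition is the elementwise product, so that `G = (A.T A) * (B.T B)`; the function name `rho_g` is hypothetical.

```python
import numpy as np

def rho_g(A, B):
    """Rho_G = trace(G) / sum(G), with G = (A.T @ A) * (B.T @ B) taken elementwise."""
    G = (A.T @ A) * (B.T @ B)   # n x n, elementwise product of the two Gram matrices
    return np.trace(G) / np.sum(G)

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 8))   # m x n
B = rng.standard_normal((5, 8))   # p x n
print(f"Rho_G = {rho_g(A, B):.4f}")
```

Under this reading, `sum(G)` equals `||A B.T||_F^2`, so Rho_G compares the diagonal mass of `G` with the squared Frobenius norm of the exact product. Two sanity cases: identity matrices give Rho_G = 1, and matrices whose n columns are all identical give Rho_G = 1/n.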

### LICENSE
This project is licensed under the MIT License, as indicated in the source code comments.

### NOTES & KEY FEATURES
*   **Target Rho_G Generation**: A significant feature is the ability to generate matrices A and B that aim for specific values of `Rho_G = trace(G) / sum(G)`, where `G = (A.T A) * (B.T B)` with `*` read as the elementwise (NumPy-style) product. This allows a controlled study of how matrix structure affects the results.
*   **Optimal v_k^* Computation**: The `compute_optimal_vk_star` function calculates the true minimum possible error for a given k. This is computationally intensive and often skipped for larger `n_dim` or `k` values (configurable threshold).
*   **SRHT Implementation**: The Subsampled Randomized Hadamard Transform (SRHT) implementation includes a manual Fast Walsh-Hadamard Transform (FWHT) and handles matrix padding to the next power of 2 if necessary.
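*   **CVXPY Solver**: Ensure `cvxpy` is installed with a suitable solver. A default `cvxpy` installation bundles open-source solvers (SCS, plus ECOS or Clarabel depending on the `cvxpy` version), which should work for the QP-based bound. You can list the available solvers with `cvxpy.installed_solvers()`.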
*   **Relative Squared Error**: The plots typically display relative *squared* error, `(||A B.T - C W.T||_F / ||A B.T||_F)^2`, where `C W.T` is an algorithm's approximation of `A B.T` and `||.||_F` is the Frobenius norm.
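
The SRHT note above mentions a manual FWHT with padding to the next power of 2. A minimal sketch of those two pieces follows. It is not the library's implementation: the names `next_power_of_two`, `pad_columns`, and `fwht` are hypothetical, and the transform is left unnormalized.

```python
import numpy as np

def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()

def pad_columns(A):
    """Zero-pad A on the right so its column count is a power of two."""
    n = A.shape[1]
    return np.pad(A, ((0, 0), (0, next_power_of_two(n) - n)))

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform along the last axis.

    The length of the last axis must be a power of two.
    """
    x = np.array(x, dtype=float)   # copy so the caller's array is not modified
    n = x.shape[-1]
    if n & (n - 1) != 0:
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # View the axis as blocks of size 2h and apply one butterfly stage.
        blocks = x.reshape(x.shape[:-1] + (n // (2 * h), 2, h))
        top = blocks[..., 0, :] + blocks[..., 1, :]
        bottom = blocks[..., 0, :] - blocks[..., 1, :]
        x = np.stack([top, bottom], axis=-2).reshape(x.shape[:-1] + (n,))
        h *= 2
    return x

A = np.arange(15, dtype=float).reshape(3, 5)
A_padded = pad_columns(A)                               # shape (3, 8)
A_mixed = fwht(A_padded) / np.sqrt(A_padded.shape[1])   # orthonormal scaling
```

Zero-padding both A and B along the common dimension leaves `A @ B.T` unchanged, and the transform costs O(n log n) per row instead of the O(n^2) of an explicit Hadamard matrix multiply.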

================================================================================